PII Detection - 12 Model Benchmark Report
300 Test Cases (Base 200 + Advanced 100) · V1 Full Prompt · FP8 Quantization · NVIDIA L40S 46GB
F1 Score Comparison
Detailed Statistics
| Model | Cases | Perfect | Accuracy | Precision | Recall | F1 | TP | FP | FN | Latency |
Confusion Matrix (Document-Category Level)
TP = PII exists & detected (good), TN = No PII & not detected (good), FP = No PII but detected (false alarm), FN = PII exists but missed (privacy risk)
Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP)
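The definitions above can be illustrated with a minimal sketch. It assumes each document is represented by two sets: the PII categories actually present (gold) and the categories the model reported (pred). The data layout, the category names, and the function names are hypothetical and are not taken from the benchmark's own code.

```python
from collections import Counter

def confusion_counts(gold, pred, categories):
    """Tally TP/TN/FP/FN per category over paired documents.

    gold, pred: lists of sets of category names, one set per document
    (an assumed layout, chosen only for illustration).
    """
    counts = {c: Counter() for c in categories}
    for g, p in zip(gold, pred):
        for c in categories:
            if c in g and c in p:
                counts[c]["TP"] += 1   # PII exists and was detected
            elif c in g:
                counts[c]["FN"] += 1   # PII exists but was missed
            elif c in p:
                counts[c]["FP"] += 1   # no PII, but one was reported
            else:
                counts[c]["TN"] += 1   # no PII, none reported
    return counts

def rate(num, den):
    """Return num/den, or None when the denominator is zero."""
    return num / den if den else None

def sensitivity(c):
    return rate(c["TP"], c["TP"] + c["FN"])

def specificity(c):
    return rate(c["TN"], c["TN"] + c["FP"])

# Toy data: four documents, two hypothetical categories.
categories = ["email", "phone"]
gold = [{"email"}, set(), {"phone"}, {"email", "phone"}]
pred = [{"email"}, {"email"}, set(), {"email"}]

counts = confusion_counts(gold, pred, categories)
for name in categories:
    c = counts[name]
    print(name, dict(c), sensitivity(c), specificity(c))
```

Returning None for an empty denominator keeps a category with no positive (or no negative) documents from showing up as a misleading 0% in a per-category table.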
Per-Category Confusion Matrix
| Category | TP | TN | FP | FN | Sensitivity | Specificity |
Case Browser
Filters: Model, Dataset, Result, PII
Case Study: Qwen3-30B-A3B Error Analysis
Qwen3-30B-A3B (MoE 30B, 3B active): In-Depth Analysis of Imperfect Cases
Failure Pattern Classification